Effectively Creating Weakly Labeled Training Examples via Approximate Domain Knowledge
نویسندگان
چکیده
One of the challenges to information extraction is the requirement of human annotated examples, commonly called gold-standard examples. Many successful approaches alleviate this problem by employing some form of distant supervision, i.e., look into knowledge bases such as Freebase as a source of supervision to create more examples. While this is perfectly reasonable, most distant supervision methods rely on a hand-coded background knowledge that explicitly looks for patterns in text. For example, they assume all sentences containing Person X and Person Y are positive examples of the relation married(X, Y). In this work, we take a different approach – we infer weakly supervised examples for relations from models learned by using knowledge outside the natural language task. We argue that this method creates more robust examples that are particularly useful when learning the entire information-extraction model (the structure and parameters). We demonstrate on three domains that this form of weak supervision yields superior results when learning structure compared to using distant supervision labels or a smaller set of gold-standard labels.
منابع مشابه
Exploiting weakly-labeled Web images to improve object classification: a domain adaptation approach
Most current image categorization methods require large collections of manually annotated training examples to learn accurate visual recognition models. The time-consuming human labeling effort effectively limits these approaches to recognition problems involving a small number of different object classes. In order to address this shortcoming, in recent years several authors have proposed to le...
متن کاملKnowledge Transfer from Weakly Labeled Audio using Convolutional Neural Network for Sound Events and Scenes
In this work we propose approaches to effectively transfer knowledge from weakly labeled web audio data. We first describe a convolutional neural network (CNN) based framework for sound event detection and classification using weakly labeled audio data. Our model trains efficiently from audios of variable lengths; hence, it is well suited for transfer learning. We then propose methods to learn ...
متن کاملTraining Object Detection Models with Weakly Labeled Data
Appearance based object detection systems utilizing statistical models to capture real world variations in appearance have been shown to exhibit good detection performance. The parameters of these statistical models are typically learned automatically from labeled training images. This process can be difficult in that a large number of labeled training examples may be needed to accurately model...
متن کاملSemi-Supervised Training of Models for Appearance-Based Statistical Object Detection Methods
Appearance-based object detection systems using statistical models have proven quite successful. They can reliably detect textured, rigid objects in a variety of poses, lighting conditions and scales. However, the construction of these systems is time-consuming and difficult because a large number of training examples must be collected and manually labeled in order to capture variations in obje...
متن کاملNumerical Solution of Weakly Singular Ito-Volterra Integral Equations via Operational Matrix Method based on Euler Polynomials
Introduction Many problems which appear in different sciences such as physics, engineering, biology, applied mathematics and different branches can be modeled by using deterministic integral equations. Weakly singular integral equation is one of the principle type of integral equations which was introduced by Abel for the first time. These problems are often dependent on a noise source which a...
متن کامل